AITopics | dialog policy

A Task-oriented Dialog Model with Task-progressive and Policy-aware Pre-training

Zhong, Lucen, Lu, Hengtong, Yuan, Caixia, Wang, Xiaojie, Sun, Jiashen, Zeng, Ke, Wan, Guanglu

arXiv.org Artificial Intelligence, Oct-1-2023

Pre-trained conversation models (PCMs) have achieved promising progress in recent years. However, existing PCMs for task-oriented dialog (TOD) are insufficient for capturing the sequential nature of the TOD-related tasks, as well as for learning dialog policy information. To alleviate these problems, this paper proposes a task-progressive PCM with two policy-aware pre-training tasks. The model is pre-trained through three stages where TOD-related tasks are progressively employed according to the task logic of the TOD system. A global policy consistency task is designed to capture the multi-turn dialog policy sequential relation, and an act-based contrastive learning task is designed to capture similarities among samples with the same dialog policy. Our model achieves better results on both MultiWOZ and In-Car end-to-end dialog modeling benchmarks with only 18% of the parameters and 25% of the pre-training data used by the previous state-of-the-art PCM, GALAXY. We make our code and data publicly available.
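
The act-based contrastive task mentioned in this abstract can be illustrated with a generic supervised contrastive loss, where samples that share a dialog act are treated as positives. This is only a minimal sketch under stated assumptions: the function names, the cosine similarity, and the temperature value are hypothetical choices for illustration, and the paper's actual objective may differ.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def act_contrastive_loss(embeddings, acts, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's code).

    Samples with the same dialog act label are positives for each other;
    all other samples in the batch act as negatives.
    """
    n = len(embeddings)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and acts[j] == acts[i]]
        if not positives:
            continue  # an anchor without any positive contributes nothing
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

# Toy batch: two "inform" samples and two "request" samples.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched = act_contrastive_loss(batch, ["inform", "inform", "request", "request"])
mismatched = act_contrastive_loss(batch, ["inform", "request", "inform", "request"])
```

In the toy batch, the loss is lower when embeddings with the same act label are already close together (`matched`) than when the labels cut across the clusters (`mismatched`), which is the behavior such a loss is meant to encourage.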

dialog policy, pcm, proceedings, (12 more...)

2310.00597

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.47)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative

Shimoyama, Sho, Morimura, Tetsuro, Abe, Kenshi, Takamichi, Toda, Tomomatsu, Yuta, Sugiyama, Masakazu, Hentona, Asahi, Azuma, Yuuki, Ninomiya, Hirotaka

arXiv.org Artificial Intelligence, Jul-13-2023

Dialog policies, which determine a system's action based on the current state at each dialog turn, are crucial to the success of the dialog. In recent years, reinforcement learning (RL) has emerged as a promising option for dialog policy learning (DPL). In RL-based DPL, dialog policies are updated according to rewards. The manual construction of fine-grained rewards, such as state-action-based ones, to effectively guide the dialog policy is challenging in multi-domain task-oriented dialog scenarios with numerous state-action pair combinations. One way to estimate rewards from collected data is to train the reward estimator and dialog policy simultaneously using adversarial learning (AL). Although this method has demonstrated superior performance experimentally, it is fraught with the inherent problems of AL, such as mode collapse. This paper first identifies the role of AL in DPL through detailed analyses of the objective functions of dialog policy and reward estimator. Next, based on these analyses, we propose a method that eliminates AL from reward estimation and DPL while retaining its advantages. We evaluate our method using MultiWOZ, a multi-domain task-oriented dialog corpus.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2307.06721

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Improving Proactive Dialog Agents Using Socially-Aware Reinforcement Learning

Kraus, Matthias, Wagner, Nicolas, Riekenbrauck, Ron, Minker, Wolfgang

arXiv.org Artificial Intelligence, Jun-22-2023

The next step for intelligent dialog agents is to escape their role as silent bystanders and become proactive. Well-defined proactive behavior may improve human-machine cooperation, as the agent takes a more active role during interaction and takes responsibility off the user. However, proactivity is a double-edged sword because poorly executed pre-emptive actions may have a devastating effect not only on the task outcome but also on the relationship with the user. For designing adequate proactive dialog strategies, we propose a novel approach that includes both social and task-relevant features in the dialog. Here, the primary goal is to optimize proactive behavior so that it is task-oriented (implying high task success and efficiency) while also being socially effective by fostering user trust. Including both aspects in the reward function for training a proactive dialog agent using reinforcement learning showed the benefit of our approach for more successful human-machine cooperation.
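
To make the idea of a reward that covers both task-oriented and social aspects concrete, the following sketch combines task success, efficiency, and a trust estimate into a single scalar. All names, the linear weighting, and the default weights are assumptions made for illustration; the abstract does not specify how the authors' reward is composed.

```python
def proactive_reward(task_success, num_turns, trust,
                     max_turns=20, w_task=1.0, w_eff=0.5, w_trust=1.0):
    """Hypothetical end-of-dialog reward mixing task and social terms.

    task_success: bool, whether the user's task was completed.
    num_turns:    number of turns used (fewer turns means higher efficiency).
    trust:        assumed user-trust estimate in [0, 1].
    """
    # Efficiency is 1.0 for an instant dialog and 0.0 at or beyond max_turns.
    efficiency = 1.0 - min(num_turns, max_turns) / max_turns
    return w_task * float(task_success) + w_eff * efficiency + w_trust * trust

# Same task outcome and length, different levels of user trust.
trusted = proactive_reward(True, 10, trust=0.8)
distrusted = proactive_reward(True, 10, trust=0.2)
```

With these assumed weights, two dialogs that succeed in the same number of turns are still ranked differently when the user's trust differs, so a policy trained on this signal is pushed toward proactive actions that do not damage the relationship with the user.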

machine learning, natural language, reinforcement learning, (18 more...)

doi: 10.1145/3565472.3595611

2211.15359

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Washington > King County > Redmond (0.04)
North America > United States > Massachusetts (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

An Asynchronous Updating Reinforcement Learning Framework for Task-oriented Dialog System

Zhang, Sai, Hu, Yuwei, Wang, Xiaojie, Yuan, Caixia

arXiv.org Artificial Intelligence, May-4-2023

Reinforcement learning has been applied to train dialog systems in many works. Previous approaches divide the dialog system into multiple modules, including DST (dialog state tracking) and DP (dialog policy), and train these modules simultaneously. However, different modules influence each other during training. The errors from DST might misguide the dialog policy, and the system action brings extra difficulties for the DST module. To alleviate this problem, we propose the Asynchronous Updating Reinforcement Learning framework (AURL), which updates the DST module and the DP module asynchronously under a cooperative setting. Furthermore, curriculum learning is implemented to address the problem of unbalanced data distribution during reinforcement learning sampling, and multiple user models are introduced to increase the dialog diversity. Results on the public SSD-PHONE dataset show that our method achieves a compelling result with a 31.37% improvement on the dialog success rate. The code is publicly available via https://github.com/shunjiu/AURL.

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.1109/ICASSP49357.2023.10096940

2305.02718

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Doc2Bot: Accessing Heterogeneous Documents via Conversational Bots

Fu, Haomin, Zhang, Yeqin, Yu, Haiyang, Sun, Jian, Huang, Fei, Si, Luo, Li, Yongbin, Nguyen, Cam-Tu

arXiv.org Artificial Intelligence, Nov-19-2022

This paper introduces Doc2Bot, a novel dataset for building machines that help users seek information via conversations. This is of particular interest for companies and organizations that own a large number of manuals or instruction books. Despite its potential, the nature of our task poses several challenges: (1) documents contain various structures that hinder the ability of machines to comprehend, and (2) user information needs are often underspecified. Compared to prior datasets that either focus on a single structural type or overlook the role of questioning to uncover user needs, the Doc2Bot dataset is developed to target such challenges systematically. Our dataset contains over 100,000 turns based on Chinese documents from five domains, larger than any prior document-grounded dialog dataset for information seeking. We propose three tasks in Doc2Bot: (1) dialog state tracking to track user intentions, (2) dialog policy learning to plan system actions and contents, and (3) response generation which generates responses based on the outputs of the dialog policy. Baseline methods based on the latest deep learning models are presented, indicating that our proposed tasks are challenging and worthy of further research.

information, machine learning, natural language, (20 more...)

2210.1106

Country:

Asia > Middle East > Republic of Türkiye (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

LAD: Language Models as Data for Zero-Shot Dialog

Mehri, Shikib, Altun, Yasemin, Eskenazi, Maxine

arXiv.org Artificial Intelligence, Jul-28-2022

… dialog remains elusive. A likely reason for this discrepancy is that dialog models require significant data because they need to learn task-specific structural constraints, such as the domain ontology and the dialog policy. While large language models (e.g., GPT-3) exhibit strong language understanding and generation abilities (Brown et al., 2020), they have no a priori knowledge of the structural constraints implied by a specific (unseen) problem setting (e.g., relevant intents, dialog policy, etc.). As such, in order to adapt a pre-trained LM for task-oriented dialog, it is necessary to impose structural constraints on the unstructured … However, fine-tuning can be impractical (e.g., in academic settings) with large LMs (e.g., GPT-3) due to the cost, computational power and immutable architectures. To this end, this paper aims to address the following: 'How can we leverage the strong language understanding and generation abilities of large LMs to facilitate zero-shot generalization in task-oriented dialog?' Given the in-context meta-learning abilities of large LMs (Brown et al., 2020), prior work has explored prompt-engineering or prompt-tuning (Reynolds and McDonell, 2021; Lester et al., 2021; Madotto et al., 2021). Well-designed prompts can convey the necessary structural constraints.

dialog, prediction, structural constraint, (14 more...)

2207.14393

Country:

North America > United States > New York (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Industry: Consumer Products & Services (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Schema-Guided Paradigm for Zero-Shot Dialog

Mehri, Shikib, Eskenazi, Maxine

arXiv.org Artificial Intelligence, Jun-13-2021

Developing mechanisms that flexibly adapt dialog systems to unseen tasks and domains is a major challenge in dialog research. Neural models implicitly memorize task-specific dialog policies from the training data. We posit that this implicit memorization has precluded zero-shot transfer learning. To this end, we leverage the schema-guided paradigm, wherein the task-specific dialog policy is explicitly provided to the model. We introduce the Schema Attention Model (SAM) and improved schema representations for the STAR corpus. SAM obtains significant improvement in zero-shot settings, with a +22 F1 score improvement over prior work. These results validate the feasibility of zero-shot generalizability in dialog. Ablation experiments are also presented to demonstrate the efficacy of SAM.
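
The contrast between an implicitly memorized policy and an explicitly provided one can be sketched with a toy example in which the policy is plain data handed to a fixed decision function. This is not the Schema Attention Model: the sketch replaces any learned matching with an exact-match lookup, and the schema contents, act names, and `next_action` helper are all hypothetical.

```python
from typing import Dict

# A schema maps the most recent user act to the next system action.
Schema = Dict[str, str]

def next_action(schema: Schema, user_act: str,
                fallback: str = "out_of_scope") -> str:
    """Choose the system action by reading the explicit schema."""
    return schema.get(user_act, fallback)

# Hypothetical schemas for two tasks; the decision function never changes.
hotel_schema: Schema = {
    "greet": "ask_dates",
    "give_dates": "ask_guests",
    "give_guests": "confirm_booking",
}
taxi_schema: Schema = {
    "greet": "ask_pickup",
    "give_pickup": "ask_destination",
    "give_destination": "confirm_ride",
}

hotel_step = next_action(hotel_schema, "give_dates")
taxi_step = next_action(taxi_schema, "give_pickup")
```

Because the task-specific behavior lives entirely in the schema argument, supporting a new task in this toy setting only requires writing a new schema, with no change to `next_action`. That separation is the intuition behind providing the policy explicitly rather than having a model memorize it from training data.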

dialog policy, representation, schema graph, (12 more...)

2106.07056

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Resource Constrained Dialog Policy Learning via Differentiable Inductive Logic Programming

Zhou, Zhenpeng, Beirami, Ahmad, Crook, Paul, Shah, Pararth, Subba, Rajen, Geramifard, Alborz

arXiv.org Artificial Intelligence, Nov-10-2020

Motivated by the needs of resource-constrained dialog policy learning, we introduce dialog policy via differentiable inductive logic (DILOG). We explore the tasks of one-shot learning and zero-shot domain transfer with DILOG on SimDial and MultiWoZ. Using a single representative dialog from the restaurant domain, we train DILOG on the SimDial dataset and obtain 99% in-domain test accuracy. We also show that the trained DILOG zero-shot transfers to all other domains with 99% accuracy, proving the suitability of DILOG to slot-filling dialogs. We further extend our study to the MultiWoZ dataset achieving 90% inform and success metrics. We also observe that these metrics are not capturing some of the shortcomings of DILOG in terms of false positives, prompting us to measure an auxiliary Action F1 score. We show that DILOG is 100x more data efficient than state-of-the-art neural approaches on MultiWoZ while achieving similar performance metrics. We conclude with a discussion on the strengths and weaknesses of DILOG.

arxiv preprint arxiv, dilog, food pref, (13 more...)

2011.05457

Genre: Research Report (0.50)

Industry: Consumer Products & Services > Restaurants (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

User Intention Recognition and Requirement Elicitation Method for Conversational AI Services

Tian, Junrui, Tu, Zhiying, Wang, Zhongjie, Xu, Xiaofei, Liu, Min

arXiv.org Artificial Intelligence, Sep-3-2020

In recent years, chatbots have become a new type of intelligent terminal that guides users to consume services. However, they are most often criticized because the services they provide are not what users expect, or not what they expect most. This defect is mostly due to two problems: first, the user's requirement expression is incomplete and uncertain because of information asymmetry; second, the diversity of service resources makes service selection difficult. A conversational bot is a typical mesh device, so guided multi-round Q&A is the most effective way to elicit user requirements. However, complex Q&A with too many rounds is tedious and leads to a bad user experience. Therefore, we aim to obtain user requirements as accurately as possible in as few rounds as possible. To achieve this, a user intention recognition method based on Knowledge Graph (KG) was developed for fuzzy requirement inference, and a requirement elicitation method based on Granular Computing was proposed for dialog policy generation. Experimental results show that these two methods can effectively reduce the number of conversation rounds and can quickly and accurately identify the user intention.

machine learning, natural language, question answering, (19 more...)

2009.01509

Country: Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.67)
(2 more...)

Variational Reward Estimator Bottleneck: Learning Robust Reward Estimator for Multi-Domain Task-Oriented Dialog

Park, Jeiyoon, Lee, Chanhee, Kim, Kuekyeng, Lim, Heuiseok

arXiv.org Artificial Intelligence, May-30-2020

Despite the notable success of adversarial learning approaches in multi-domain task-oriented dialog systems, training the dialog policy via adversarial inverse reinforcement learning often fails to balance the performance of the policy generator and reward estimator. During optimization, the reward estimator often overwhelms the policy generator and produces excessively uninformative gradients. We propose the Variational Reward estimator Bottleneck (VRB), an effective regularization method that aims to constrain unproductive information flows between inputs and the reward estimator. The VRB focuses on capturing discriminative features by exploiting an information bottleneck on mutual information. Empirical results on a multi-domain task-oriented dialog dataset demonstrate that the VRB significantly outperforms previous methods.
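
A common way to realize an information bottleneck in practice is to encode the input as a diagonal Gaussian and penalize its KL divergence from a standard normal prior. The sketch below shows that generic regularizer only. Whether VRB uses exactly this form is an assumption, and the function names and the `beta` weight are hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def bottleneck_loss(estimator_loss, mu, log_var, beta=0.01):
    """Estimator loss plus a beta-weighted bottleneck penalty (illustrative)."""
    return estimator_loss + beta * kl_to_standard_normal(mu, log_var)

# An encoding equal to the prior carries no penalty; shifting its mean does.
no_penalty = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
shifted = kl_to_standard_normal([1.0], [0.0])
regularized = bottleneck_loss(1.0, [1.0], [0.0], beta=0.1)
```

The penalty grows as the encoding moves away from the prior, so the weight `beta` trades off how much input information the estimator may retain against how well it fits its training objective.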

computational linguistic, machine learning, reinforcement learning, (14 more...)

2006.00417

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(9 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
